Phys. Rev. E 77, 016110 (2008) 



Transient dynamics for sequence processing neural networks: effect of degree distributions

Yong Chen,1,2 Pan Zhang,2 Lianchun Yu,2 and Shengli Zhang1

1 Research Center for Science, Xi'an Jiaotong University, Xi'an 710049, China
2 Institute of Theoretical Physics, Lanzhou University, Lanzhou 730000, China
(Dated: February 1, 2008) 

We derive an analytic evolution equation for overlap parameters, including the effect of degree 
distribution on the transient dynamics of sequence processing neural networks. In the special case 
of globally coupled networks, the precisely retrieved critical loading ratio $\alpha_c = N^{-1/2}$ is obtained,
where N is the network size. In the presence of random networks, our theoretical predictions 
agree quantitatively with the numerical experiments for delta, binomial, and power-law degree 
distributions. 






PACS numbers: 87.10.+e, 89.75.Fb, 87.18.Sn, 02.50.-r



I. INTRODUCTION 

Recently, structure and dynamics in complex networks have attracted considerable attention and have been investigated in a large variety of research fields. In particular, an important topic is whether the structure of neural wiring is related to brain functions.

Starting from the pioneering works that modeled neural networks with Ising spins, a large body of research has contributed significantly to our understanding of parallel information processing in nervous tissue [?]. The equilibrium properties of the Hopfield model in a fully connected topology with the typical Hebbian prescription for the interaction strengths have been successfully described by the replica method [?]. The dynamics of the fully connected Hopfield model with static patterns and with sequence patterns have been widely studied using generating functional analysis [?] and signal-to-noise analysis [?].

In recent years, there have been a large number of numerical studies of the Hopfield model on complex network structures, focusing on how the topology of a network, the degree distribution in particular, affects the computational performance of the formation of associative memories [10, 11, 12, 13, 14]. Various random diluted models have been studied, including the extreme diluted model [15, 16], the finite diluted model [17, 18], and the finite connection model [19]. To the best of our knowledge, however, the equation of retrieval dynamics for sequence processing neural networks with complex network topology is derived here for the first time.

In this paper, we study a modification of the Hopfield network, known as the sequence processing model, which acts as a temporal associative memory model [20, 21, 22, 23]. This model is important for understanding how the nervous system learns behavioral sequences, because such learning requires hundreds of transitions that need to be precisely stored in neuronal connections [24]. The asymmetry of the interaction matrix rules out equilibrium statistical mechanical methods of analysis, including conventional replica theory. The goal of this work is to study the effect of the degree distribution on the transient dynamics of the sequence processing neural network. Using a probabilistic approach, we derive an analytic time evolution equation for the overlap parameter with an arbitrary degree distribution that is consistent with our extensive numerical simulation results.

This paper is organized as follows. In Section II we introduce the definition of the sequence processing neural networks. The time evolution equations of the order parameters, including the effects of the degree distribution, are derived and discussed in Section III. Section IV contains the comparison of theoretical results with numerical simulations. Finally, Section V presents a summary and the concluding remarks.



II. MODEL DEFINITION 



FIG. 1: (Color online) Temporal evolution of the macroscopic overlap parameters $m^\mu$ with respect to time $t$ in the case of $w_{ij} = 1$. (a) $N = 100$, $p = 5$, and $\Delta p = 1$; (b) $N = 200$, $p = 10$, and $\Delta p = 2$.

In this paper, we consider a general version of sequence processing neural networks with parallel dynamics. The model consists of $N$ Ising spin neurons $s_i \in \{-1, 1\}$. If neuron $i$ is excited we put $s_i = 1$; otherwise (neuron $i$ is inhibited), we put $s_i = -1$. The embedded patterns are $p$ states of the system, $\xi_i^\mu \in \{1, -1\}$ ($\mu = 1, 2, \ldots, p$; $i = 1, 2, \ldots, N$). The patterns are random, so that each $\xi_i^\mu$ takes the values $\pm 1$ with equal probability. The couplings between neurons take the following form,

$$J_{ij} = \frac{w_{ij}}{N} \sum_{\mu=1}^{p} \xi_i^{\mu+\Delta p}\, \xi_j^{\mu} \quad (\mu : \mathrm{mod}\ p), \tag{1}$$

where $w_{ij} \in \{0, 1\}$ is a matrix element that sets the connection topology of the coupling matrix $J$, and $\Delta p = 0, 1, 2, \ldots, N-1$ describes the patterns learned as dynamic objects. This definition clearly describes an asymmetric neural network. In the special case of $w_{ij} = 1$ and $\Delta p = 0$, the synapses are symmetric as in the Hopfield-Hebb networks.
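As a minimal sketch of this coupling rule, assuming the $1/N$ normalization of Eq. (1) and 0-based pattern indices, the matrix can be built directly from the patterns. The function name `build_couplings`, the helper `is_symmetric`, and the sizes used here are illustrative choices, not taken from the paper.

```python
import random

def build_couplings(xi, w, dp=1):
    """J[i][j] = (w[i][j] / N) * sum_mu xi[(mu + dp) % p][i] * xi[mu][j]."""
    p, n = len(xi), len(xi[0])
    return [[w[i][j] / n * sum(xi[(mu + dp) % p][i] * xi[mu][j] for mu in range(p))
             for j in range(n)]
            for i in range(n)]

def is_symmetric(j):
    """True if J[a][b] == J[b][a] for every pair (up to rounding)."""
    return all(abs(j[a][b] - j[b][a]) < 1e-12 for a in range(len(j)) for b in range(a))

random.seed(0)
p, n = 5, 50                                                          # illustrative sizes
xi = [[random.choice((-1, 1)) for _ in range(n)] for _ in range(p)]   # random +/-1 patterns
w = [[1] * n for _ in range(n)]                                       # globally coupled: w_ij = 1

j_seq = build_couplings(xi, w, dp=1)    # sequence couplings (shifted patterns)
j_hebb = build_couplings(xi, w, dp=0)   # Hopfield-Hebb couplings (no shift)

print(is_symmetric(j_hebb), is_symmetric(j_seq))
```

The two calls differ only in the shift `dp`, which is what makes the sequence couplings asymmetric while the unshifted case stays symmetric.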

The evolution dynamics of the systems are restricted to deterministic parallel dynamics where the spins are updated 
simultaneously according to 

$$s_i(t+1) = \mathrm{sgn}\Big(\sum_j J_{ij}\, s_j(t)\Big), \tag{2}$$

where sgn(x) is the sign function. The time step is set to 1 in all our work. 

In order to analyze the retrieval dynamics, the macroscopic overlap order parameter at time t is defined by 

$$m^\mu(t) = \frac{1}{N}\sum_{i=1}^{N} \xi_i^\mu s_i(t) = \frac{2\lambda^\mu(t)}{N} - 1, \qquad \mu = 1, 2, \ldots, p. \tag{3}$$

Here, $\lambda^\mu(t)$ is the number of spins that have the same sign in the network state at time $t$ and in the given pattern $\xi^\mu$. In Fig. 1 we present the evolution of the order parameters for the typical sequence processing model with global connection. The stored memories are reconstructed as a period-$(p/\Delta p)$ cycle: $2 \to 3 \to 4 \to 5 \to 1 \to 2 \to \cdots$ in Fig. 1(a), and $4 \to 6 \to 8 \to 10 \to 2 \to 4 \to \cdots$ in Fig. 1(b). To simplify the present study, we set $\Delta p = 1$ in what follows. Note that the evolution equation for the overlap parameters obtained with $\Delta p = 1$ is also suitable for any other value of $\Delta p$.
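The retrieval cycle described above can be illustrated with a small simulation. This is a sketch under stated assumptions: a globally coupled network ($w_{ij} = 1$), an initial state placed exactly on the first pattern, 0-based pattern indices, and $\mathrm{sgn}(0)$ taken as $+1$. The function `simulate` and its parameters are hypothetical names for illustration.

```python
import random

def simulate(n=100, p=5, dp=1, steps=6, seed=1):
    """Parallel sign dynamics on a globally coupled network (w_ij = 1)."""
    rng = random.Random(seed)
    xi = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(p)]
    s = list(xi[0])                      # start exactly on the first pattern (index 0)
    history = []
    for _ in range(steps):
        # Overlaps m^mu(t) = (1/N) sum_i xi_i^mu s_i(t).
        m = [sum(x * y for x, y in zip(xi[mu], s)) / n for mu in range(p)]
        history.append(m)
        # With w_ij = 1 the local field factorizes:
        # sum_j J_ij s_j = sum_mu xi_i^(mu + dp) * m^mu.
        field = [sum(xi[(mu + dp) % p][i] * m[mu] for mu in range(p)) for i in range(n)]
        s = [1 if h >= 0 else -1 for h in field]
    return history

history = simulate()
# Index of the pattern with the largest overlap at each time step (0-based).
retrieved = [max(range(len(m)), key=lambda mu: m[mu]) for m in history]
print(retrieved)
```

At low loading the index of the dominant overlap is expected to advance by `dp` at every step and wrap around modulo `p`, which is the periodic behavior shown in Fig. 1.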



III. TRANSIENT DYNAMICS AND MACROSCOPIC OBSERVABLE 



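Considering Eq. (1) and Eq. (2), one obtains the following one-step update process,

$$\begin{aligned} s_i(t+1) &= \mathrm{sgn}\Big(\sum_j J_{ij}\, s_j(t)\Big) = \mathrm{sgn}\Big(\frac{1}{N}\sum_j \sum_{\mu} w_{ij}\, \xi_i^{\mu+1} \xi_j^{\mu} s_j(t)\Big) \\ &= \mathrm{sgn}\Big(\frac{1}{N}\sum_j w_{ij}\, \xi_i^{\nu+1}\xi_j^{\nu} s_j(t) + \frac{1}{N}\sum_j\sum_{\mu\neq\nu} w_{ij}\, \xi_i^{\mu+1}\xi_j^{\mu} s_j(t)\Big), \end{aligned} \tag{4}$$

where $\xi^\nu$ is the $\nu$th stored pattern, the one closest to the state $s(t)$, i.e., $m^\nu(t) = \max\{m^1, m^2, \ldots, m^p\}$. The contribution to a single evolution step in Eq. (4) consists of two parts, denoted by

$$h_i^1(t) = \frac{1}{N}\sum_j w_{ij}\, \xi_i^{\nu+1}\xi_j^{\nu} s_j(t), \tag{5}$$

$$h_i^2(t) = \frac{1}{N}\sum_j\sum_{\mu\neq\nu} w_{ij}\, \xi_i^{\mu+1}\xi_j^{\mu} s_j(t). \tag{6}$$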



The first part in the update function of Eq. (4), $h_i^1(t)$, drives the state of the $i$th spin to $\xi_i^{\nu+1}$ at time $t+1$. The other part, $h_i^2(t)$, is the noise term. In the case of absolutely stable and precise retrieval, $s_i(t+1) = \xi_i^{\nu+1}$. Since $s_i(t+1) = \mathrm{sgn}(h_i^1(t) + h_i^2(t))$, this requires that $h_i^2$ cannot change the sign of $h_i^1$. If $\xi_i^{\nu+1} = +1$, we have $h_i^1 > 0$. To ensure that $s_i(t+1) = \xi_i^{\nu+1}$, $h_i^2$ should satisfy $h_i^2 > -h_i^1$, so $h_i^2/h_i^1 > -1$. If $\xi_i^{\nu+1} = -1$, then $h_i^1 < 0$ and we also have $h_i^2/h_i^1 > -1$. So the probability of $s_i(t+1) = \xi_i^{\nu+1}$ is represented by the following equation,

$$P\big(s_i(t+1) = \xi_i^{\nu+1}\big) = \sum_{z_i(t) > -1} P\big(z_i(t)\big), \tag{7}$$

in which $P(z_i(t))$ is the probability of $z_i(t) = h_i^2(t)/h_i^1(t)$. Note that the degree of node $i$ is $k_i = \sum_{j=1}^{N} w_{ij}$, which means that only $k_i$ spins contribute to the local field of spin $i$. For $\xi_i^{\nu+1} = 1$,

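$$h_i^1(t) = \frac{k_i}{N}\, m^\nu. \tag{8}$$

Applying the above equation to Eq. (7), we have

$$P\big(s_i(t+1)=\xi_i^{\nu+1}\big) = \sum_{h_i^2 = -m^\nu k_i/N}^{(p-1)k_i/N} P\big(h_i^2(t)\big), \qquad (\xi_i^{\nu+1} = 1) \tag{9}$$

$$P\big(s_i(t+1)=\xi_i^{\nu+1}\big) = \sum_{h_i^2 = -(p-1)k_i/N}^{m^\nu k_i/N} P\big(h_i^2(t)\big), \qquad (\xi_i^{\nu+1} = -1). \tag{10}$$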

With the following definition 

mo = >; I ~>:"-^s-" ! i ■ 1111 



E f^E^-iW) 
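We find that

$$\sum_{h_i^2=-m^\nu k_i/N}^{(p-1)k_i/N} P\big(h_i^2(t)\big) = \frac{1}{2}\Bigg(\sum_{h_i^2=-m^\nu k_i/N}^{(p-1)k_i/N} P\big(h_i^2(t)\big) + \sum_{h_i^2=-(p-1)k_i/N}^{m^\nu k_i/N} P\big(h_i^2(t)\big)\Bigg). \tag{12}$$

Combined with Eqs. (9) and (10), we have

$$P\big(s_i(t+1)=\xi_i^{\nu+1}\big) = \sum_{h_i^2=-m^\nu k_i/N}^{(p-1)k_i/N} P\big(h_i^2(t)\big). \tag{13}$$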

Given that the stored patterns are random and independent, the probability of the total number $\lambda_N^\mu$ of spins that have the same sign in $s(t)$ and $\xi^\mu$, $P(\lambda_N^\mu)$, is given by

$$P(\lambda_N^\mu) = C_N^{\lambda_N^\mu}\, 2^{-N}, \tag{14}$$

where $C_N^{\lambda_N^\mu}$ denotes a binomial coefficient, i.e., $C_N^{\lambda_N^\mu} = \frac{N!}{\lambda_N^\mu!\,(N-\lambda_N^\mu)!}$. For the precisely retrieved case (when the system can retrieve patterns without error), $s(t) \approx \xi^\nu$, and Eq. (14) is obviously correct. Without the precisely retrieved condition, Eq. (14) in fact neglects the correlations between the network state and pattern $\mu$. If the network topology has a local tree structure, the above deduction is correct. If there exist many short loops in the network, the above derivation is only an approximation. So, in this work, our derivations below are only suitable for sparsely connected random networks, where the typical loop length is about $\log_k N$ and $k$ is the average connectivity. In this paper, we use the condition $N \to \infty$, $k \to \infty$, and $k/N \to 0$.
Then from Eq. (3), we find

$$P\Big(m^\mu(t) = \frac{2\lambda_N^\mu}{N} - 1\Big) = P(\lambda_N^\mu). \tag{15}$$

As noted above [see Eqs. (1)-(2)], the local field is filtered by the topological structure of the network. In this case, the probability of the total number $\lambda_{k_i}^\mu$ of spins that have the same sign in $s(t)$ and $\xi^\mu$ is

$$P\Big(\frac{1}{N}\sum_j w_{ij}\,\xi_j^\mu s_j(t) = \frac{2\lambda_{k_i}^\mu}{N} - \frac{k_i}{N}\Big) = P(\lambda_{k_i}^\mu) = C_{k_i}^{\lambda_{k_i}^\mu}\, 2^{-k_i}. \tag{16}$$

Substituting into Eq. (pTOj) the expressions of 



2A£ h 



N N 

we get the following form 



(17) 



p \ ££^i(*)=*n = ^^ = 1(1+^=^^^. as) 

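The other form of $h_i^2(t)$ is readily deduced from Eq. (11), and can be written as

$$h_i^2(t) = \sum_{\mu\neq\nu}\Big(\frac{2\lambda_{k_i}^\mu}{N} - \frac{k_i}{N}\Big) = 2\sum_{\mu\neq\nu}\frac{\lambda_{k_i}^\mu}{N} - (p-1)\frac{k_i}{N}. \tag{19}$$

Therefore, after coarse graining as in Eqs. (15) and (16), and using the definition $\lambda_{k_i} = \sum_{\mu\neq\nu}\lambda_{k_i}^\mu = \frac{N}{2}\big(h_i^2 + \frac{(p-1)k_i}{N}\big)$, we obtain

$$\begin{aligned} P_i = P\big(s_i(t+1)=\xi_i^{\nu+1}\big) &= \sum_{h_i^2=-m^\nu k_i/N}^{(p-1)k_i/N} P\big(h_i^2(t)\big) = \sum_{\lambda_{k_i}=\frac{1}{2}(p-1)k_i-\frac{1}{2}m^\nu k_i}^{(p-1)k_i} P(\lambda_{k_i}) \\ &= \sum_{\lambda_{k_i}=\frac{1}{2}(p-1)k_i-\frac{1}{2}m^\nu k_i}^{(p-1)k_i} C_{(p-1)k_i}^{\lambda_{k_i}}\, 2^{-(p-1)k_i} = \sum_{\lambda_{k_i}=0}^{\frac{1}{2}(p-1)k_i+\frac{1}{2}m^\nu k_i} C_{(p-1)k_i}^{\lambda_{k_i}}\, 2^{-(p-1)k_i}. \end{aligned} \tag{20}$$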
By introducing the degree distribution of the network structure $P(k)$, where $k$ is the number of links connected to a node, the total number $\lambda^{\nu+1}$ of spins that have the same sign in the state $s(t+1)$ and the stored pattern $\xi^{\nu+1}$ can be expressed as

$$\lambda^{\nu+1} = \sum_k N P_i P(k). \tag{21}$$

Then from Eq. (3), we obtain the macroscopic observable

$$m^{\nu+1}(t+1) = \frac{1}{N}\sum_i \xi_i^{\nu+1} s_i(t+1) = \sum_k 2 P_i P(k) - 1. \tag{22}$$



Combining Eq. (20) and Eq. (22) and replacing $\lambda_{k_i}$ by $n$, we arrive at the evolution equation of the overlap parameter

$$m^{\nu+1}(t+1) = 2\sum_k P(k) \sum_{n=0}^{\frac{1}{2}(p-1)k + \frac{1}{2}m^\nu(t)k} \frac{C_{(p-1)k}^{n}}{2^{(p-1)k}} - 1. \tag{23}$$

In the case of successful storage, the network finally converges to a stable periodic cycle, i.e., $m^\nu(t) = m^{\nu+1}(t+1)$. Finally, replacing $m^\nu(t) = m^{\nu+1}(t+1)$ by $m_f$, one has the iterative solution for the final overlap parameter,

$$m_f = 2\sum_k P(k) \sum_{n=0}^{\frac{1}{2}(p-1)k + \frac{1}{2}m_f k} \frac{C_{(p-1)k}^{n}}{2^{(p-1)k}} - 1. \tag{24}$$
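The binomial evolution equation, Eq. (23), can be evaluated exactly with integer arithmetic for moderate $(p-1)k$. The sketch below assumes the degree distribution is passed as a dictionary `{k: P(k)}` and that a non-integer upper summation limit is rounded down; both are illustrative choices rather than details given in the text, and `overlap_step` is a hypothetical name.

```python
from math import comb, floor

def overlap_step(m, p, degree_dist):
    """One step of the binomial overlap map, Eq. (23).

    m           : current overlap m^nu(t)
    p           : number of stored patterns
    degree_dist : dict {k: P(k)} with probabilities summing to 1
    """
    total = 0.0
    for k, prob in degree_dist.items():
        trials = (p - 1) * k
        # Upper limit of the sum over n, rounded down and clipped to [-1, trials].
        upper = max(min(floor(0.5 * trials + 0.5 * m * k), trials), -1)
        # Exact binomial tail: integer sum divided by 2^trials (int / int stays accurate).
        cdf = sum(comb(trials, n) for n in range(upper + 1)) / 2 ** trials
        total += prob * cdf
    return 2.0 * total - 1.0

# Delta degree distribution with k = 100 and p = 20 stored patterns.
m_next = overlap_step(1.0, p=20, degree_dist={100: 1.0})
print(m_next)
```

Summing the binomial coefficients as Python integers avoids the overflow that a floating-point factorial would cause, but the cost still grows with $(p-1)k$.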

Note that it is difficult to calculate the above iterative equation of the overlap parameter at very large $(p-1)k$ due to the computational complexity of the factorial term. A reasonable solution is to replace the binomial distribution in Eq. (20) by a Gaussian distribution with the same expectation value and the same standard deviation. As a result, replacing $n$ by $x$, we reformulate Eq. (23) as

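$$m^{\nu+1}(t+1) = 2\sum_k P(k) \int_{x=0}^{\frac{1}{2}(p-1+m^\nu(t))k} \frac{1}{\sqrt{2\pi}\,\sqrt{(p-1)k}/2}\, \exp\Bigg[-\frac{1}{2}\Bigg(\frac{x-(p-1)k/2}{\sqrt{(p-1)k}/2}\Bigg)^2\Bigg]\,dx - 1. \tag{25}$$

Substituting into the above equation the following expression,

$$y = \frac{x-(p-1)k/2}{\sqrt{(p-1)k}/2}, \tag{26}$$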



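we find Eq. (25) in the form

$$m^{\nu+1}(t+1) = \sum_k 2P(k)\,\frac{1}{\sqrt{2\pi}}\int_{-\sqrt{(p-1)k}}^{m^\nu(t)\sqrt{k/(p-1)}} e^{-y^2/2}\,dy - 1 \approx \sum_k P(k)\,\mathrm{erf}\Bigg(\frac{m^\nu(t)}{\sqrt{(p-1)/k}}\Bigg), \tag{27}$$

where $\mathrm{erf}(\cdot)$ is the error integral function

$$\mathrm{erf}(z) = \sqrt{\frac{2}{\pi}}\int_0^z \exp(-x^2/2)\,dx.$$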
Note that in the special case of fully connected networks, $k = N$ and $P(k) = \delta(k - N)$. In this case, Eq. (27) reduces to

$$m^{\nu+1}(t+1) = \mathrm{erf}\Bigg(\frac{m^\nu(t)}{\sqrt{\alpha}}\Bigg), \tag{28}$$

which is a well-known result [?, 25], where $\alpha = p/N$ is the loading ratio and $p - 1 \approx p$ has been used. It is interesting to compare the above equation with our result for arbitrary degree distributions: Eq. (27) contains only a small modification. It should also be mentioned that Eq. (27) has also been found for sparsely connected Hopfield networks [26].
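The Gaussian form of the evolution equation, Eq. (27), is cheap to evaluate for any degree distribution. The sketch below assumes that the error integral of the text is $\sqrt{2/\pi}\int_0^z e^{-x^2/2}dx$, which equals the standard error function evaluated at $z/\sqrt{2}$; the names `paper_erf` and `overlap_step_gauss` are illustrative.

```python
from math import erf, sqrt

def paper_erf(z):
    """sqrt(2/pi) * integral_0^z exp(-x^2/2) dx, written via the standard erf."""
    return erf(z / sqrt(2.0))

def overlap_step_gauss(m, p, degree_dist):
    """One step of Eq. (27): m(t+1) = sum_k P(k) * erf(m(t) / sqrt((p - 1) / k))."""
    return sum(prob * paper_erf(m / sqrt((p - 1) / k))
               for k, prob in degree_dist.items())

# Delta degree distribution with k = 100 and p = 20 (assumed example values).
sparse = overlap_step_gauss(1.0, p=20, degree_dist={100: 1.0})
# Fully connected case, k = N = 100 with p = 10, i.e. alpha = 0.1.
full = overlap_step_gauss(1.0, p=10, degree_dist={100: 1.0})
print(sparse, full)
```

Because only the ratio $(p-1)/k$ enters, a node of degree $k$ behaves like a fully connected network with an effective loading ratio of about $p/k$.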
FIG. 2: (Color online) The final overlap $m_f$ in the globally coupled network with $N = 100, 200$. (a) (o, x): iterative results from Eq. (23). (□, +): simulation results. The solid line is $m_f = 0.99$, which corresponds to $\alpha \approx 0.15$ in both simulations and iterative results. The inset shows the same result with abscissa $p$. (b) The iterative error $\Delta m_f$ versus the loading ratio $\alpha$. Each simulation point represents an average of 200 trials.



IV. NUMERICAL STUDIES 



To verify the theory, we performed extensive simulations, which are reported in this section. First of all, we give a succinct description of how our calculations for the iterative Eq. (24) were made. Note that a precondition of our work is that the network is capable of memorizing patterns in the form of its equilibria in each trial [see Eqs. (7)-(27)]. The initial overlap $m_f^{1}$ on the right-hand side of the iterative equation is set to 1.0. Then the second value is obtained and used as the initial overlap to calculate the third one. This process is repeated until $m_f^{n} \approx m_f^{n+1}$ within the allowed precision. Thus, we arrive at the final stable macroscopic overlap parameter $m_f = m_f^{n}$ after $n$ iterative steps. In our simulations, it is found that the iterative procedure always converges quite rapidly, stopping after 4 to 5 steps at most.
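The iteration just described can be sketched as a simple fixed-point loop. The example assumes a delta degree distribution (a single degree `k`), the binomial form of Eq. (24) with the upper summation limit rounded down, and a tolerance of $10^{-4}$; the function names and the chosen values of `p` and `k` are illustrative.

```python
from math import comb, floor

def binomial_map(m, p, k):
    """Right-hand side of Eq. (24) for a single degree k."""
    trials = (p - 1) * k
    upper = min(floor(0.5 * trials + 0.5 * m * k), trials)
    # Exact binomial sum as integers, then one int / int division.
    cdf = sum(comb(trials, n) for n in range(upper + 1)) / 2 ** trials
    return 2.0 * cdf - 1.0

def final_overlap(p, k, tol=1e-4, max_iter=100):
    """Iterate m -> binomial_map(m) from m = 1.0 until successive values agree within tol."""
    m = 1.0
    for step in range(1, max_iter + 1):
        m_new = binomial_map(m, p, k)
        if abs(m_new - m) < tol:
            return m_new, step
        m = m_new
    return m, max_iter

# Globally coupled example: k = N = 100 and p = 10, i.e. alpha = 0.1.
m_f, n_steps = final_overlap(p=10, k=100)
print(m_f, n_steps)
```

Starting from 1.0 matches the precondition that the network begins in a retrieved state; other starting values would probe the basin of attraction instead.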

In order to compare the different effects of various degree distributions on the performance of neural networks, we 
consider the following two cases. One case is the globally coupled network. Although this case appears somewhat 
trivial, it is helpful to compare our study with some common conclusions. The other case is the random network with 
various degree distributions, including the delta function, binomial, and power-law distributions. 



A. Globally coupled networks 



In this case, all the neuronal spins are connected with each other at any time, namely $w_{ij} = 1$, and the degree distribution $P(k)$ is a $\delta$-function. This actually introduces a huge waste of energy but provides a neural network with the maximal retrieval performance. A large number of numerical and theoretical studies of the dynamics of this network have been made in the last two decades. These studies revealed that there exists a critical loading ratio known as the Amit-Gutfreund-Sompolinsky (AGS) value $\alpha_1 \approx 0.139$ for symmetric networks [?], a saturated storage capacity $\alpha_s \approx 0.269$ for sequence processing networks [?], and the so-called exactly memorized capacity in Ref. [?].

FIG. 3: (Color online) The critical precise loading ratio $\alpha_c = p_c/N$ for globally coupled networks. The solid line is $\alpha_c = 1/\sqrt{N}$. (o): simulation results; each point represents an average of 200 trials. (x): iterative results from Eq. (24) with precision $\Delta m < 10^{-4}$. (□): results from the Amari theory, Eq. (3.4) in Ref. [?].

It is found that the iterative form Eq. (24) for the final overlap $m_f$ is effective in most successfully retrieved cases, with negligible error. This is evident in Fig. 2, which shows $m_f$ for $N = 100, 200$ up to the saturation of the loading ratio $\alpha$. The iterative results are almost the same as the simulation results when $m_f$ is very close to 1. For example, $\alpha \approx 0.15$ for $\Delta m_f = 0.001$ and $\alpha \approx 0.20$ for $\Delta m_f = 0.002$. When the final overlap parameter $m_f$ deviates from 1 as the loading ratio $\alpha$ increases, the iterative error increases sharply. However, the final overlap parameter has only a small and acceptable error $\Delta m_f < 0.04$ near the saturation $\alpha_s = 0.269$ [see Fig. 2(b)]. As stated above for Eq. (7), we study only the first-step behavior of the macroscopic overlap parameters. In order to get more precise results, the signal-to-noise analysis for the first few time steps should be considered [?].

One may be interested in the case where the system can retrieve stored patterns without error. We define this as the precisely retrieved case. In other words, pattern $\nu$ at time $t$ and pattern $\nu + 1$ at time $t+1$ are retrieved without error, which means $m^\nu(t) = 1$ and $m^{\nu+1}(t+1) = 1$. In the stationary state, $m_f = 1$. In our theory, the critical precisely retrieved storage $\alpha_c$ can be calculated from Eq. (24) by setting $m_f = 1$. In Fig. 3 we present $\alpha_c$ obtained from Eq. (24) to compare with the simulation results. Evidently, $\alpha_c \approx 1/\sqrt{N}$ from the iterative algorithm Eq. (24) is consistent with the simulation results. Here, it should be mentioned that there is another capacity definition based on absolute stability, $\alpha_A = 1/(2\log N - \log\log N)$, called the Amari capacity [?]. One has the following relationship between the critical loading ratios,

$$\alpha_c < \alpha_A < \alpha_s. \tag{29}$$

Note that the Amari capacity $\alpha_A$ means that there exists some probability for precise retrieval of stored patterns at least one time in a large number of trials. However, the so-called precisely retrieved capacity $\alpha_c$ in this paper means that the system must precisely retrieve stored patterns in each trial.
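The ordering in Eq. (29) can be checked numerically for the network sizes used in the figures. This sketch assumes natural logarithms in the Amari capacity, which the text does not specify, and takes $\alpha_s = 0.269$ as quoted above; the function names are illustrative.

```python
from math import log, sqrt

ALPHA_S = 0.269   # saturated storage capacity quoted in the text

def alpha_c(n):
    """Precisely retrieved capacity, alpha_c = 1 / sqrt(N)."""
    return 1.0 / sqrt(n)

def alpha_amari(n):
    """Amari capacity, 1 / (2 log N - log log N); natural logarithms assumed."""
    return 1.0 / (2.0 * log(n) - log(log(n)))

for n in (100, 200, 1000):
    ordered = alpha_c(n) < alpha_amari(n) < ALPHA_S
    print(n, round(alpha_c(n), 4), round(alpha_amari(n), 4), ordered)
```

Under this assumption the ordering holds for the sizes shown; for very small $N$ the two size-dependent capacities can cross, so the inequality should be read as a statement about sufficiently large networks.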

B. Random networks 

We take more general situations into consideration and explore the effect of degree distributions on the transient 
dynamics of neural networks in the case of random connection. In the following context, we study three situations: 
the delta function, binomial, and power-law degree distributions. 

In Fig. 4 we plot the temporal evolution of overlaps for the above three types of degree distributions obtained from both the theory [Eq. (27)] and simulations. The parameters are: $N = 50000$, the average degree $\bar{k} = 100$, and the number of stored patterns $p = 20$. The first numerical experiment uses the delta-function degree distribution

$$P(k) = \delta(k - \bar{k}). \tag{30}$$

FIG. 4: (Color online) The temporal evolution of overlaps from our theory (line) and simulation (*) for (a) delta, (b) binomial, and (c) power-law degree distributions. The parameters of the networks are $N = 50000$, $p = 20$, and the average degree $\bar{k} = 100$.



This connection topology is generated by randomizing a regular lattice whose average degree is $\bar{k}$. The temporal evolutions of the overlap parameters from the theory and numerical simulations are plotted in Fig. 4(a). The second one is a binomial distribution, which comes from an Erdős-Rényi random graph [28] [see the inset of Fig. 4(b)],

$$P(k) = C_N^k \Big(\frac{\bar{k}}{N}\Big)^k \Big(1 - \frac{\bar{k}}{N}\Big)^{N-k}. \tag{31}$$

The third one is the power-law distribution [see the inset of Fig. 4(c)],

$$P(k) \sim k^{-3}. \tag{32}$$

The power-law degree distribution can be generated using preferential attachment [29]. The theoretical results from our scheme are consistent with the simulations for the three degree distributions above.
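To make the comparison of the three degree distributions concrete, the sketch below builds each $P(k)$ and iterates the Gaussian overlap map of Eq. (27) for $p = 20$. Several choices are assumptions for illustration only: the binomial case uses $N = 5000$ to keep the sum short, the power law is truncated to $50 \le k \le 5000$ so that its mean degree is close to 100, and the error integral follows the $\exp(-x^2/2)$ convention.

```python
from math import erf, exp, lgamma, log, sqrt

def delta_dist(k_mean):
    """P(k) = delta(k - k_mean)."""
    return {k_mean: 1.0}

def binomial_dist(n, k_mean):
    """P(k) = C(n, k) q^k (1 - q)^(n - k) with q = k_mean / n, evaluated in log space."""
    q = k_mean / n

    def log_prob(k):
        return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                + k * log(q) + (n - k) * log(1.0 - q))

    return {k: exp(log_prob(k)) for k in range(n + 1)}

def power_law_dist(k_min, k_max, gamma=3.0):
    """P(k) proportional to k^(-gamma) on k_min..k_max, normalized to 1."""
    weights = {k: k ** -gamma for k in range(k_min, k_max + 1)}
    norm = sum(weights.values())
    return {k: wgt / norm for k, wgt in weights.items()}

def overlap_step(m, p, dist):
    """Gaussian overlap map; the text's erf(z) equals the standard erf(z / sqrt(2))."""
    return sum(prob * erf(m * sqrt(k / (p - 1)) / sqrt(2.0)) for k, prob in dist.items())

dists = {
    "delta": delta_dist(100),
    "binomial": binomial_dist(5000, 100),
    "power-law": power_law_dist(50, 5000),   # assumed cutoffs, mean degree near 100
}
final = {}
for name, dist in dists.items():
    m = 1.0
    for _ in range(10):                       # ten parallel update steps
        m = overlap_step(m, p=20, dist=dist)
    final[name] = m
    print(name, round(m, 3))
```

The power-law case puts most of its weight on low-degree nodes, so its overlap is expected to settle somewhat below the delta and binomial cases at the same mean degree.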

Note that the cases presented so far are all situations of successful retrieval of the stored patterns. Fig. 5 plots the time evolution of overlaps in the case of failed retrieval with $p = 60$. Here too, the theoretical results agree well with the simulations.

Furthermore, we study the effect of network size on the transient dynamics. Figure 6 shows the comparison between our theory [Eq. (27)] and the numerical simulations for $N = 10000$ and $N = 50000$ in the case of a binomial degree distribution. As $N$ increases with all other parameters held fixed, the theoretical prediction comes closer to the simulation result. In fact, this size effect comes from the loop structure in the network, which our Eq. (27) does not take into account. Loop structure refers to the existence of (perhaps many) short loops in the network, such as triangles or quadrangles (see Ref. [13]). These short loops may couple the order parameters at different times and complicate the dynamics. Ref. [30] suggested a parameter, the loopiness coefficient, to investigate the effect of loops in networks. The loopiness coefficient grows with $k/N$ in random networks. Our formula therefore performs better as the network connectivity becomes sparser and the loopiness coefficient smaller.

FIG. 5: (Color online) The temporal evolution of overlaps from our theory (line) and simulation (*) for the power-law degree distribution $P(k) \sim k^{-3}$. The parameters of the networks are $N = 50000$, $p = 60$, and the average degree $\bar{k} = 100$.

FIG. 6: (Color online) The temporal evolution of overlaps for the binomial degree distribution with the average degree $\bar{k} = 100$, $p = 20$, and (a) $N = 10000$, (b) $N = 50000$.



V. SUMMARY AND CONCLUDING REMARKS 



In this paper, we have discussed the effect of the degree distribution on the transient dynamics of sequence processing neural networks. In the absence of loop structure, we derived the analytic evolution equation for the overlap parameter [Eq. (27)] including the effect of the degree distribution, which is also obtained in the sparsely connected Hopfield model [20]. In the case of globally coupled networks, the so-called precisely retrieved capacity $\alpha_c = N^{-1/2}$ is suggested by both the theory and simulations, whereas in the case of random networks, our theoretical predictions are consistent with the numerical simulation results for three situations: the delta, binomial, and power-law degree distributions.

It should be mentioned that, in the present work, the most efficient arrangement for storage and retrieval of patterns in sequence by the artificial neural network is the random topology. In real brains, however, the topology of neural systems appears more complicated and the effect of loop structure becomes inevitable [31, 32]. In a special case, the role of loop structure has been studied without the effect of the degree distribution [?]. In future work, we will focus on how to combine the effects of the degree distribution and the loop structure.